Create single node control plane based on installer bootstrap #440
eranco74 wants to merge 4 commits into openshift:master
| ### Non-Goals
|
| 1. Single ignition config that can be used for multiple clusters
There was a problem hiding this comment.
I suspect that many will see this as valuable enough to build in from the start because of latency issues, but @crawford probably knows better.
| Demonstrate a prototype of creating a simple static Ignition file that boots an RHCOS machine and launches a basic Kube control plane
|
| ### Goals
There was a problem hiding this comment.
I think goals need to enumerate the components we want running. I'll throw one possible starting point out:
- etcd
- kube-apiserver
- kube-controller-manager
- kube-scheduler
- oauth-apiserver
- oauth-server
- olm
- nothing else.
This gives a kube control plane.
There was a problem hiding this comment.
@eranco74 let's list what is provided by the bootstrap static pods.
There was a problem hiding this comment.
On the bootstrap node we have these static pod YAML files:
- etcd-member-pod.yaml
- kube-apiserver-pod.yaml
- kube-controller-manager-pod.yaml
- kube-scheduler-pod.yaml
- bootstrap-pod.yaml (cluster version operator)
- recycler-pod.yaml (doesn't seem relevant)
Running containers:
crictl ps | awk '{print $7}'
POD
kube-apiserver-insecure-readyz
kube-apiserver
kube-controller-manager
kube-scheduler
cluster-version-operator
etcd-metrics
etcd-member
Pods that show up with kubectl:
kubectl --kubeconfig auth/kubeconfig get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system bootstrap-kube-apiserver-master1 2/2 Running 0 37m
kube-system bootstrap-kube-controller-manager-master1 1/1 Running 0 37m
kube-system bootstrap-kube-scheduler-master1 1/1 Running 0 37m
We also have machineconfigoperator-bootstrap-pod.yaml, which runs machine-config-server (it gets removed once the bootstrap manages to apply all the manifests: https://github.com/openshift/installer/blob/master/data/data/bootstrap/files/usr/local/bin/bootkube.sh.template#L371)
There was a problem hiding this comment.
These are the static pod manifests we have on a (bare-metal) OpenShift master node (3-node installation):
1. etcd-pod.yaml
2. kube-apiserver-pod.yaml
3. kube-controller-manager-pod.yaml
4. kube-scheduler-pod.yaml
5. coredns.yaml
6. haproxy.yaml
7. keepalived.yaml
8. mdns-publisher.yaml
9. recycler-pod.yaml
There was a problem hiding this comment.
keepalived/coredns/haproxy are there because you're looking at a BM (bare-metal) platform cluster. I guess when we try something similar with the none platform, the static pods will be aligned.
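To make the comparison between the two lists in this thread concrete, here is a minimal shell sketch that diffs the bootstrap static pod manifests against the ones found on the bare-metal master. The file names are copied from the comments above; the `only_in` helper is hypothetical and not part of any OpenShift tooling.

```shell
#!/bin/sh
# Sketch only: manifest names are copied from the comments in this thread;
# only_in is a hypothetical helper, not an OpenShift tool.
LC_ALL=C; export LC_ALL   # keep sort and comm in the same collation order

bootstrap_pods="etcd-member-pod.yaml
kube-apiserver-pod.yaml
kube-controller-manager-pod.yaml
kube-scheduler-pod.yaml
bootstrap-pod.yaml
recycler-pod.yaml"

master_pods="etcd-pod.yaml
kube-apiserver-pod.yaml
kube-controller-manager-pod.yaml
kube-scheduler-pod.yaml
coredns.yaml
haproxy.yaml
keepalived.yaml
mdns-publisher.yaml
recycler-pod.yaml"

# only_in A B: print the names that appear in list A but not in list B.
only_in() {
    a=$(mktemp) && b=$(mktemp) || return 1
    printf '%s\n' "$1" | sort > "$a"
    printf '%s\n' "$2" | sort > "$b"
    comm -23 "$a" "$b"
    rm -f "$a" "$b"
}

echo "Only on the bare-metal master:"
only_in "$master_pods" "$bootstrap_pods"
echo "Only on the bootstrap node:"
only_in "$bootstrap_pods" "$master_pods"
```

The master-only entries include coredns.yaml, haproxy.yaml, keepalived.yaml and mdns-publisher.yaml, which matches the observation that those come from the bare-metal platform; the etcd manifest also differs by name between the two lists.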
| 1. Create a single node cluster composed of static pods, similar to the installer bootstrap.
|
| ### Non-Goals
There was a problem hiding this comment.
- Running a single node means certain management activity is difficult or impossible with the current operator design. We should indicate whether we want operators running and, if so, for which parts.
- We should indicate whether or not this needs to be able to upgrade. Again, @crawford.
| Initial POC - https://docs.google.com/document/d/1pWauEQXl__39fMeLBIQpPnBNXdd92JNOylAnk8LCW_M/edit?usp=sharing
|
| All certificates will be generated by the openshift installer.
There was a problem hiding this comment.
I think the installer wants to stop creating certificates and would prefer to delegate to the operators themselves in rendering.
| 1. Create a single node cluster composed of static pods, similar to the installer bootstrap.
|
| ### Non-Goals
There was a problem hiding this comment.
- decide whether or not cert rotation is important. If not, and if we decide to produce these static pods, it is possible for us to choose a different expiry, measured in years.
There was a problem hiding this comment.
I guess we can start with 10-year validity and add rotation later.
| 1. Operators are not running in the cluster and we need a way to rotate all certificates with a bash script using oc similar to this:
| https://github.com/code-ready/snc/blame/master/kubelet-bootstrap-cred-manager-ds.yaml.in
| Is this the best way to handle it?
There was a problem hiding this comment.
What if we decide we don't rotate, but that clusters created in this mode create certificates good for X years instead of X days?
There was a problem hiding this comment.
Then we'll need to update all the parts that generate those certificates, right? Is there no single place?
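As a rough illustration of the "years instead of days" idea discussed in this thread, the sketch below issues a throwaway self-signed certificate valid for about 10 years and checks that it would still be valid 9 years from now. This is an assumption-laden stand-in: it uses plain `openssl`, not the installer's or the operators' certificate generation, and the subject name and file paths are made up for the example.

```shell
#!/bin/sh
# Sketch only: a self-signed certificate stands in for the cluster certificates.
# The CN and file names are hypothetical; nothing here reflects how the
# installer or the operators actually generate certificates.
validity_days=3650   # roughly 10 years, the value floated in this thread

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT   # clean up the throwaway key and cert

if command -v openssl >/dev/null 2>&1; then
    # Issue a self-signed certificate with the long validity period.
    openssl req -x509 -newkey rsa:2048 -nodes \
        -keyout "$workdir/key.pem" -out "$workdir/cert.pem" \
        -days "$validity_days" -subj "/CN=single-node-example" 2>/dev/null

    # Print the expiry date, then confirm the cert is still valid 9 years out.
    openssl x509 -in "$workdir/cert.pem" -noout -enddate
    nine_years=$((9 * 365 * 24 * 3600))
    openssl x509 -in "$workdir/cert.pem" -noout -checkend "$nine_years"
fi
```

The open question from the thread still applies: in a real cluster the validity period is set wherever each certificate is generated, so a single `validity_days` knob like the one above may not exist.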
| ### Implementation Details/Notes/Constraints [optional]
|
| Initial POC - https://docs.google.com/document/d/1pWauEQXl__39fMeLBIQpPnBNXdd92JNOylAnk8LCW_M/edit?usp=sharing
There was a problem hiding this comment.
Can you expand this into a high-level flow here and say whether or not it worked?
I recall suggesting that you
- install a cluster
- remove two nodes
- run an etcd recovery on the remaining node to get a good etcd
- see if mostly works
If that mostly works, then I think we can talk about a possible path forward where the full configuration input is provided in manifests and operators render out the "finished" static pod instead of a bootstrap static pod. Or something similar. @crawford again.
There was a problem hiding this comment.
Cluster downscale POC:
The cluster seems OK except for:
- openshift-ingress router: 1/2 in status Pending, since it's configured with 2 replicas and we have a single node.
- etcd-quorum-guard: 2/3 in status Pending, since it's configured with 3 replicas and we have a single node.
We ran the OpenShift conformance tests (Feature:ProjectAPI) on the single node as well (6 pass, 0 skip (48.2s)).
We did another POC transforming the installer bootstrap node into a single node cluster (replaced the link with the POC details).
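The Pending counts reported in the downscale POC are what you would expect if each node can host at most one replica of these workloads. That one-replica-per-node rule is an assumption made for this sketch (for example through anti-affinity or host networking); the thread itself only reports the replica counts. The `pending_replicas` helper is hypothetical.

```shell
#!/bin/sh
# pending_replicas REPLICAS NODES
# Assumption (not confirmed in this thread): each node can run at most one
# replica, so every replica beyond the node count stays Pending.
pending_replicas() {
    replicas=$1
    nodes=$2
    if [ "$replicas" -gt "$nodes" ]; then
        echo $((replicas - nodes))
    else
        echo 0
    fi
}

# Replica counts taken from the POC report; node count is 1 after the downscale.
echo "router: $(pending_replicas 2 1) of 2 Pending"
echo "etcd-quorum-guard: $(pending_replicas 3 1) of 3 Pending"
```

Under that assumption the sketch reproduces the 1/2 and 2/3 Pending figures from the report, which suggests the fix is to scale these workloads to a single replica rather than to debug the pods themselves.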
Main changes are:
- Put a little more emphasis on describing the installer interface
- Add more details to the summary and motivation section
- Mention the non-goal of being able to expand this cluster
- Mention we want to support users customizing this all-in-one config
- Copy the POC details from the Google doc so they are publicly visible
- Add an open question about whether the bootstrap static pods are suitable
| ## Proposal
|
| When a machine is booted with aio.ign, the aiokube systemd service is
| launched (similar to bootkube in the bootstrap ignition).
I'm wondering whether an API VIP is required (instead of relying on external DNS).
eca5199 to bec93b0
| - "@markmc"
| creation-date: yyyy-mm-dd
| last-updated: yyyy-mm-dd
| status: provisional|implementable|implemented|deferred|rejected|withdrawn|replaced |
There was a problem hiding this comment.
nit: fill in the dates and pick a status?
| # Single node installation
|
| Add a new `create aio-config` command to `openshift-installer` which
There was a problem hiding this comment.
I am not wild about abbreviated command names, although I am ~ok with shorter aliases. I'd rather address long-command-name concerns with auto-complete scripts ;). Can we make this create single-node-config or some such?
There was a problem hiding this comment.
Hmm, it's also not clear to me why you can't just use the existing create ignition-configs with an install-config.yaml requesting replicas: 1 for the control plane and replicas: 0 for compute. Why does this need a new subcommand?
There was a problem hiding this comment.
The new sub-command is required because we want a new installation flow that allows installing the node without an auxiliary (bootstrap) node, using just RHCOS plus an Ignition config.
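For reference, the alternative raised in this thread (reusing `create ignition-configs` with one control-plane replica and zero compute replicas) would start from an install-config.yaml along the lines of the sketch below. The cluster name, base domain and `none` platform are placeholder choices for illustration, `pullSecret` and `sshKey` are omitted, and the installer binary is referred to by its upstream name `openshift-install`.

```shell
#!/bin/sh
# Sketch only: cluster name and base domain are placeholders, and the
# required pullSecret / sshKey fields are left out for brevity.
dir=$(mktemp -d)
cat > "$dir/install-config.yaml" <<'EOF'
apiVersion: v1
baseDomain: example.com
metadata:
  name: single-node
controlPlane:
  name: master
  replicas: 1
compute:
- name: worker
  replicas: 0
platform:
  none: {}
EOF

# With the existing flow this would be followed by:
#   openshift-install create ignition-configs --dir "$dir"
# That flow also emits a bootstrap Ignition config, i.e. it still expects an
# auxiliary bootstrap machine, which is what the new sub-command avoids.
grep -n 'replicas:' "$dir/install-config.yaml"
```

This is why replica counts alone do not cover the proposal: they change the shape of the resulting cluster, but not the need for a separate bootstrap node during installation.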
| replaces:
| - "/enhancements/that-less-than-great-idea.md"
| superseded-by:
| - "/enhancements/our-past-effort.md" |
There was a problem hiding this comment.
nit: remove replaces and superseded-by unless you have more to put in them than the dummy placeholders.
Renamed `create aio-config` to `create single-node-config`
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: eranco74.
| # Single node installation
|
| Add a new `create single-node-config` command to `openshift-installer` which
| allows a user to create an `aio.ign` Ignition configuration which
There was a problem hiding this comment.
what does aio signify?
| https://github.com/openshift/enhancements/pull/302
|
| ## Motivation
There was a problem hiding this comment.
This feels like it's missing the use-case? Is it just for demoing something - I'm not really clear on the why for this...
There was a problem hiding this comment.
This is a first step toward a zero-touch single node cluster.
This enhancement describes a new single-node cluster profile for production use in "edge" deployments that are not considered to be resource-constrained, such as telecommunications bare metal environments. Signed-off-by: Doug Hellmann <dhellmann@redhat.com>
@openshift-bot: Closed this PR.